34 research outputs found

    Technologies respectueuses de la vie privée pour le covoiturage

    Get PDF
    L'émergence des téléphones mobiles et objets connectés a profondément changé notre vie quotidienne. Ces dispositifs, grâce à la multitude de capteurs qu'ils embarquent, permettent l'accès à un large spectre de services. En particulier, les capteurs de position ont contribué au développement des services de localisation tels que la navigation, le covoiturage, le suivi de la congestion en temps réel... En dépit du confort offert par ces services, la collecte et le traitement des données de localisation portent de sérieuses atteintes à la vie privée des utilisateurs. En effet, ces données peuvent renseigner les fournisseurs de services sur les points d'intérêt (domicile, lieu de travail, orientation sexuelle), les habitudes ainsi que le réseau social des utilisateurs. D'une façon générale, la protection de la vie privée des utilisateurs peut être assurée par des dispositions légales ou techniques. Même si les mesures d'ordre légal peuvent dissuader les fournisseurs de services et les individus malveillants à enfreindre le droit à la vie privée des utilisateurs, les effets de telles mesures ne sont observables que lorsque l'infraction est déjà commise et détectée. En revanche, l'utilisation des technologies renforçant la protection de la vie privée (PET) dès la phase de conception des systèmes permet de réduire le taux de réussite des attaques contre la vie privée des utilisateurs. L'objectif principal de cette thèse est de montrer la viabilité de l'utilisation des PET comme moyens de protection des données de localisation dans les services de covoiturage. Ce type de service de localisation, en aidant les conducteurs à partager les sièges vides dans les véhicules, contribue à réduire les problèmes de congestion, d'émissions et de dépendance aux combustibles fossiles. Dans cette thèse, nous étudions les problèmes de synchronisation d'itinéraires et d'appariement relatifs au covoiturage avec une prise en compte explicite des contraintes de protection des données de localisation (origine, destination). Les solutions proposées dans cette thèse combinent des algorithmes de calcul d'itinéraires multimodaux avec plusieurs techniques de protection de la vie privée telles que le chiffrement homomorphe, l'intersection sécurisée d'ensembles, le secret partagé, la comparaison sécurisée d'entier. Elles garantissent des propriétés de protection de vie privée comprenant l'anonymat, la non-chainabilité et la minimisation des données. De plus, elles sont comparées à des solutions classiques, ne protégeant pas la vie privée. Nos expérimentations indiquent que les contraintes de protection des données privées peuvent être prise en compte dans les services de covoiturage sans dégrader leurs performances.The emergence of mobile phones and connected objects has profoundly changed our daily lives. These devices, thanks to the multitude of sensors they embark, allow access to a broad spectrum of services. In particular, position sensors have contributed to the development of location-based services such as navigation, ridesharing, real-time congestion tracking... Despite the comfort offered by these services, the collection and processing of location data seriously infringe the privacy of users. In fact, these data can inform service providers about points of interests (home, workplace, sexual orientation), habits and social network of the users. In general, the protection of users' privacy can be ensured by legal or technical provisions. While legal measures may discourage service providers and malicious individuals from infringing users' privacy rights, the effects of such measures are only observable when the offense is already committed and detected. On the other hand, the use of privacy-enhancing technologies (PET) from the design phase of systems can reduce the success rate of attacks on the privacy of users. The main objective of this thesis is to demonstrate the viability of the usage of PET as a means of location data protection in ridesharing services. This type of location-based service, by allowing drivers to share empty seats in vehicles, helps in reducing congestion, CO2 emissions and dependence on fossil fuels. In this thesis, we study the problems of synchronization of itineraries and matching in the ridesharing context, with an explicit consideration of location data (origin, destination) protection constraints. The solutions proposed in this thesis combine multimodal routing algorithms with several privacy-enhancing technologies such as homomorphic encryption, private set intersection, secret sharing, secure comparison of integers. They guarantee privacy properties including anonymity, unlinkability, and data minimization. In addition, they are compared to conventional solutions, which do not protect privacy. Our experiments indicate that location data protection constraints can be taken into account in ridesharing services without degrading their performance

    Fairness Under Demographic Scarce Regime

    Full text link
    Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is partially available because a record was not maintained throughout data collection or due to privacy reasons. This setting is known as demographic scarce regime. Prior research have shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, the use of proxy-sensitive attributes worsens fairness-accuracy trade-offs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy trade-offs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples with demographic information inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes is detrimental to fairness and accuracy. Our experiments on two datasets showed that the proposed framework yields models with significantly better fairness-accuracy trade-offs compared to classic attribute classifiers. Surprisingly, our framework outperforms models trained with constraints on the true sensitive attributes.Comment: 14 pages, 7 page

    Probabilistic Dataset Reconstruction from Interpretable Models

    Full text link
    Interpretability is often pointed out as a key requirement for trustworthy machine learning. However, learning and releasing models that are inherently interpretable leaks information regarding the underlying training data. As such disclosure may directly conflict with privacy, a precise quantification of the privacy impact of such breach is a fundamental problem. For instance, previous work have shown that the structure of a decision tree can be leveraged to build a probabilistic reconstruction of its training dataset, with the uncertainty of the reconstruction being a relevant metric for the information leak. In this paper, we propose of a novel framework generalizing these probabilistic reconstructions in the sense that it can handle other forms of interpretable models and more generic types of knowledge. In addition, we demonstrate that under realistic assumptions regarding the interpretable models' structure, the uncertainty of the reconstruction can be computed efficiently. Finally, we illustrate the applicability of our approach on both decision trees and rule lists, by comparing the theoretical information leak associated to either exact or heuristic learning algorithms. Our results suggest that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones, for a given accuracy level

    Adversarial training approach for local data debiasing

    Full text link
    The widespread use of automated decision processes in many areas of our society raises serious ethical issues concerning the fairness of the process and the possible resulting discriminations. In this work, we propose a novel approach called GANsan whose objective is to prevent the possibility of any discrimination i.e., direct and indirect) based on a sensitive attribute by removing the attribute itself as well as the existing correlations with the remaining attributes. Our sanitization algorithm GANsan is partially inspired by the powerful framework of generative adversarial networks (in particular the Cycle-GANs), which offers a flexible way to learn a distribution empirically or to translate between two different distributions. In contrast to prior work, one of the strengths of our approach is that the sanitization is performed in the same space as the original data by only modifying the other attributes as little as possible and thus preserving the interpretability of the sanitized data. As a consequence, once the sanitizer is trained, it can be applied to new data, such as for instance, locally by an individual on his profile before releasing it. Finally, experiments on a real dataset demonstrate the effectiveness of the proposed approach as well as the achievable trade-off between fairness and utility

    Robustness in Machine Learning Explanations: Does It Matter?

    Get PDF
    The explainable AI literature contains multiple notions of what an explanation is and what desiderata explanations should satisfy. One implicit source of disagreement is how far the explanations should reflect real patterns in the data or the world. This disagreement underlies debates about other desiderata, such as how robust explanations are to slight perturbations in the input data. I argue that robustness is desirable to the extent that we’re concerned about finding real patterns in the world. The import of real patterns differs according to the problem context. In some contexts, non-robust explanations can constitute a moral hazard. By being clear about the extent to which we care about capturing real patterns, we can also determine whether the Rashomon Effect is a boon or a bane

    Privacy Enhancing Technologies for Ridesharing

    No full text
    International audienceThe ubiquitous world in which we live has fostered the development of technologies that take into account the context in which users are using them to deliver high quality services. For example, by providing personalized services based on the positions of users, Location-based services (LBS) encourage the emergence of ridesharing services. This success has come at the cost of user privacy. Indeed in current ridesharing services, users are not in control of their own location data and have to trust the ridesharing operators with the management of their data. In this paper we aim at developing privacy preserving ridesharing systems. Our first experiments proved that privacy could be improved without scarifying the utility of the delivered service
    corecore